Background

The shortcomings of the classic low-order and geometrically inflexible
particle-in-cell methods for modeling kinetic plasma phenomena are becoming
increasingly clear when one attempts to model many types of applications of
mission-critical importance to the US Air Force. However, the development of
faster and more accurate alternatives remains a major challenge, since such
developments require one to rethink fundamental choices and design criteria.


In this effort we have focused the research on the development of computational
methods for the modeling of plasma problems dominated by kinetic effects, and
we consider methods where the spatial dimensions are discretized by using a
Discontinuous Galerkin method. For the phase-space dynamics we primarily
seek to develop a new generation of particle-in-cell methods for solving
high-speed problems on general unstructured grids, but we also consider
challenges associated with efficiently solving the Vlasov equation. Applications
include accelerators, microwave generators and laser-matter interaction.

While such methods were successfully developed during the beginning of
this project, these techniques tend to be computationally very demanding. A
significant part of the project is therefore also devoted to the development of new
efficient algorithms and their efficient implementation on modern computational
platforms.

In the latter part of the effort we have begun to focus on the accurate modeling
of complex continuum problems to lay the groundwork for the development of
efficient and accurate hybrid methods to enable the future modeling of complex
multi-scale problems.

Overview  of  contributions. 

Throughout the effort, we have pursued the development and analysis of a
number of algorithms for kinetic plasma physics modeling.


Basic algorithms. We have developed and analyzed novel and fast ways of
identifying and depositing charge and currents to the nodes of the DG method
in a PIC simulation. This step of the scheme was identified as the computational
bottleneck, and we developed and investigated several techniques, including
ones exploring the use of a Cartesian grid as an intermediary between the
charge deposition and the unstructured grids. The properties of these techniques
in terms of charge, energy, and momentum conservation have been carefully
studied.
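
As a minimal illustration of what charge deposition and a charge-conservation
check look like, the sketch below deposits point charges onto a uniform,
periodic one-dimensional Cartesian grid using linear (cloud-in-cell) weights.
This is a textbook simplification, not the DG-node deposition developed in this
project; the function name and all parameters are hypothetical.

```python
def deposit_charge(positions, charges, n_cells, length):
    """Deposit point charges onto the nodes of a uniform periodic 1D grid.

    Uses linear (cloud-in-cell) weights. Returns the charge density at
    each of the n_cells nodes. Illustrative only; names are hypothetical.
    """
    dx = length / n_cells
    rho = [0.0] * n_cells
    for x, q in zip(positions, charges):
        s = (x % length) / dx          # position in units of cells
        left = int(s)                  # index of the node to the left
        w = s - left                   # fractional distance to that node
        rho[left % n_cells] += q * (1.0 - w) / dx
        rho[(left + 1) % n_cells] += q * w / dx
    return rho


if __name__ == "__main__":
    rho = deposit_charge([0.1, 0.55, 0.93], [1.0, -2.0, 0.5], 10, 1.0)
    # Total deposited charge should match the total particle charge.
    print(sum(rho) * 0.1)
```

Because the two weights of each particle sum to one, the total charge on the
grid equals the total particle charge up to round-off, which is the simplest
of the conservation properties mentioned above.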

We implemented and tested, for the first time, a full three-dimensional PIC
scheme with support for fully relativistic particles. Tests indicate very good
performance in terms of accuracy.
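
To make the notion of a relativistic particle update concrete, the following
sketch implements the widely used relativistic Boris push. The report does not
state which particle integrator was used, so the choice of the Boris scheme,
the function names, and the normalized units (force = q(E + v x B), default
c = 1) are all assumptions made for illustration.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def boris_push(x, u, E, B, q, m, dt, c=1.0):
    """Advance position x and u = gamma*v by one time step (Boris scheme)."""
    k = 0.5 * q * dt / m
    # First half of the electric acceleration.
    u_minus = tuple(ui + k * Ei for ui, Ei in zip(u, E))
    gamma = math.sqrt(1.0 + sum(ui * ui for ui in u_minus) / c ** 2)
    # Magnetic rotation; it preserves the magnitude of u.
    t = tuple(k * Bi / gamma for Bi in B)
    t2 = sum(ti * ti for ti in t)
    u_prime = tuple(a + b for a, b in zip(u_minus, cross(u_minus, t)))
    s = tuple(2.0 * ti / (1.0 + t2) for ti in t)
    u_plus = tuple(a + b for a, b in zip(u_minus, cross(u_prime, s)))
    # Second half of the electric acceleration.
    u_new = tuple(ui + k * Ei for ui, Ei in zip(u_plus, E))
    gamma_new = math.sqrt(1.0 + sum(ui * ui for ui in u_new) / c ** 2)
    x_new = tuple(xi + dt * ui / gamma_new for xi, ui in zip(x, u_new))
    return x_new, u_new
```

In a pure magnetic field the magnitude of u, and hence the particle energy, is
preserved by construction, which is one reason this scheme is a common
baseline for relativistic PIC codes.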


Figure 1: Examples of a three-dimensional DG-based PIC simulation. The test is
for the divergence of a relativistic bunch of 25k particles moving down a metallic
tube, with the top figures showing the bunch of particles and the electric field,
respectively. At the bottom we show the initial conditions in the grid as well as
a comparison between the exact solution and the computed solution, showing
excellent agreement.

Efficient time-integration techniques: A significant computational bottleneck is
the temporal integration of the full Maxwell-Vlasov problem on general
unstructured grids. This is not different from the pure Maxwell problem, where
complex geometries or poor gridding can lead to very small time-steps and a
resulting significant computational cost. For the PIC model, this becomes a much
more significant problem due to the high cost of the particle model and
deposition. Furthermore, whereas the small cells are often induced by the grid,
the particle dynamics do not require such small time-steps to accurately capture
the important time-scales.

We have addressed this problem in two different ways. In the first one we use
a mixed implicit-explicit time-stepping approach in which the Maxwell part is
advanced implicitly, hence overcoming the small cell problem, while the particles
are evolved explicitly. This also effectively deals with the important problem
of divergence error control through hyperbolic cleaning techniques. Extensive
tests and evaluations confirm that this is accurate and efficient, and we have
demonstrated a potential for a significant speedup.
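
The idea behind implicit-explicit (IMEX) time-stepping can be sketched on a
scalar model problem y' = -k*y + g(t, y), where the stiff linear term (standing
in for the field solve on small cells) is treated implicitly and the remaining
term (standing in for the particle contribution) explicitly. This is a
first-order toy example under those assumptions, not the scheme used in the
project; all names are hypothetical.

```python
def imex_euler_step(y, t, dt, stiff_rate, explicit_rhs):
    """One first-order IMEX Euler step for y' = -stiff_rate*y + explicit_rhs(t, y).

    The stiff term is taken at the new time level (implicit), the other
    term at the old time level (explicit).
    """
    return (y + dt * explicit_rhs(t, y)) / (1.0 + dt * stiff_rate)


def integrate(y0, dt, n_steps, stiff_rate, explicit_rhs):
    """Apply n_steps IMEX Euler steps starting from y0 at t = 0."""
    y, t = y0, 0.0
    for _ in range(n_steps):
        y = imex_euler_step(y, t, dt, stiff_rate, explicit_rhs)
        t += dt
    return y


if __name__ == "__main__":
    # dt * stiff_rate = 100: a fully explicit Euler step would be unstable here.
    print(integrate(0.0, 0.1, 20, 1000.0, lambda t, y: 1000.0))
```

With dt times the stiff rate equal to 100, a fully explicit Euler scheme would
diverge, while the IMEX step remains stable and relaxes to the steady state
y = 1. The time-step is then limited only by the explicitly treated term.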

Figure 2: Graphs showing the excellent accuracy of the energy and divergence
with increasing time-steps using the implicit-explicit approach.

An alternative to this, and possibly a better overall solution, is the development
of accurate and robust local time-stepping methods. This was first done
and demonstrated for Maxwell’s equations and subsequently extended, using a
novel extrapolation technique for the particle dynamics on the fine cells, to the
full PIC problem. This is the first robust and high-order accurate local
time-stepping technique for PIC problems, and the computational gains of this
approach are very substantial.
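
A rough sense of where the gains of local time-stepping come from can be had
from a simple cost count. The sketch below assigns each cell a time-step level
from a CFL-type restriction, with steps differing by powers of two, and
compares the number of cell updates needed to advance one coarse step with and
without local time-stepping. The power-of-two grouping, the function names, and
the cost model (one unit per cell update, no particle cost) are assumptions for
illustration only.

```python
import math


def lts_levels(cell_sizes, cfl=1.0, wave_speed=1.0):
    """Return the coarsest stable step and a refinement level per cell.

    A cell at level k is advanced with time-step dt_max / 2**k.
    """
    dt_cells = [cfl * h / wave_speed for h in cell_sizes]
    dt_max = max(dt_cells)
    levels = [max(0, math.ceil(math.log2(dt_max / dt))) for dt in dt_cells]
    return dt_max, levels


def update_counts(cell_sizes, cfl=1.0, wave_speed=1.0):
    """Cell updates per coarse step: (global time-stepping, local time-stepping)."""
    _, levels = lts_levels(cell_sizes, cfl, wave_speed)
    finest = max(levels)
    global_updates = len(cell_sizes) * 2 ** finest
    local_updates = sum(2 ** k for k in levels)
    return global_updates, local_updates


if __name__ == "__main__":
    # Nine regular cells and one cell that is eight times smaller.
    print(update_counts([1.0] * 9 + [0.125]))
```

For nine regular cells and a single cell eight times smaller, a global
time-step requires 80 cell updates per coarse step, while the local approach
requires 17. A real scheme must additionally handle the coupling between
levels, which this count ignores.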


Particle dynamics and models: By developing a self-consistent 1.5-dimensional
test case that can be solved using both a PIC approach and direct Vlasov-solver
techniques, we have been able to carefully analyze the accuracy and
resolution requirements of the PIC methods being developed. These studies
confirm that the more advanced particle shapes considered in this work are
beneficial over simpler ones, but also that the number of particles needed to
accurately model the phase-space dynamics typically is much higher than is
used in large-scale applications. These studies rely on the ongoing development
of a database of PIC benchmarks. We have, after discussions
with Kirtland AFB and other key people within the kinetic plasma physics
modeling community, begun the development of such a database of standard
test cases and benchmarks. The beginning of the database, still in extensive
development, can be seen at http://piki.tiker.net/wiki

Accurate and efficient new basis for Vlasov solvers: The direct solution of the
Vlasov equation is computationally very expensive due to its high-dimensional
nature. The development of efficient and compact ways of representing the
velocity dynamics is therefore of primary interest to reduce the overall cost.

We have developed, carefully analyzed and implemented a novel rational
basis, which is particularly well suited for the Vlasov equation. The basis is
defined on all of R and allows the use of the FFT but does not require periodicity.
It maintains spectral convergence, and extensive analysis and tests confirm that
it is superior to widely used alternatives such as Hermite functions and mapped
Chebyshev functions. This basis has been implemented and used in a Vlasov
solver, showing excellent results and performance which is superior to standard
approaches.
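
To illustrate how a rational basis on all of R can be connected to the FFT
without periodicity, the sketch below uses the classical Wiener rational
functions phi_n(x) = (1 - ix)^n / (sqrt(pi) (1 + ix)^(n+1)). That the basis
developed here coincides with this classical family is an assumption based only
on the name used in the Figure 3 caption; the function names are hypothetical.
Under the map x = tan(theta/2) these functions become Fourier modes in theta,
which the code uses to check orthonormality numerically.

```python
import math


def wiener(n, x):
    """Classical Wiener rational function phi_n evaluated at real x (n >= 0)."""
    return (1 - 1j * x) ** n / ((1 + 1j * x) ** (n + 1) * math.sqrt(math.pi))


def inner(n, m, num=256):
    """Approximate the L2 inner product of phi_n and phi_m over all of R.

    The substitution x = tan(theta/2) maps R to (-pi, pi); the integral is
    then evaluated with the midpoint rule on a uniform grid in theta.
    """
    h = 2.0 * math.pi / num
    total = 0j
    for k in range(num):
        theta = -math.pi + (k + 0.5) * h
        x = math.tan(0.5 * theta)
        jacobian = 0.5 * (1.0 + x * x)   # dx/dtheta
        total += wiener(n, x) * wiener(m, x).conjugate() * jacobian * h
    return total


if __name__ == "__main__":
    print(abs(inner(2, 2)), abs(inner(1, 3)))
```

Since the mapped integrand is a pure Fourier mode on a uniform grid in theta,
expansion coefficients in such a basis can be computed with an FFT even though
the original function on R is not periodic.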


Figure 3: On the left we show results for a full Vlasov solver used to model
the two-stream instability, with the velocity space represented in different
ways. The new Wiener basis shows excellent long-time agreement with a
Hermite-based method which, however, is substantially more expensive. On the
right we show a direct comparison with a PIC model, illustrating convergence of
the two approaches.

Acceleration and algorithm developments enabled through GPU computing:
While the algorithmic changes themselves are important and valuable, the
speedup offered by the use of Graphics Processing Units (GPUs) is also
substantial. This, however, requires a detailed rethinking and reformulation
of the algorithms to fully exploit this potential.

We  have  ported  the  full  Maxwell  solvers  to  GPUs,  resulting  in  significant 
speedup.  This  has  been  extended  to  also  include  local  time-stepping  and 
nonlinear  equations  such  as  Euler  equations.  These  developments  have 
illustrated  the  potential  for  1-2  orders  of  magnitude  increase  in  computational 
efficiency at very limited hardware cost. It is essential to appreciate, however,
that this is not achieved through simple implementation changes but requires
deep changes in the algorithmic structure of the methods. For example, whereas
the low-order methods typically used in PIC modeling may see some benefit from
GPU-based computing, the high-order methods used here are much better suited for
this and offer truly outstanding utilization of these new computational platforms.

Development of accurate and efficient ways to deal with shocks. When aiming
to solve more complex problems, e.g., multi-scale and/or high-speed flow
problems, the emergence of shocks or very steep gradients presents a significant
challenge. This is relevant also for kinetic plasma physics applications in such
areas as laser-matter interaction or complex multiphysics where a
kinetic/continuum model is most appropriate.

For  this  purpose  we  have  developed  a  novel,  cell-local  shock  detector  for  use 
with  discontinuous  Galerkin  (DG)  methods.  The  output  of  this  detector  is  a 
reliably  scaled,  element-wise  smoothness  estimate  which  is  suited  as  a  control 
input  to  a  shock  capture  mechanism.  Using  an  artificial  viscosity  in  the  latter 
role,  we  obtain  a  DG  scheme  for  the  numerical  solution  of  nonlinear  systems  of 
conservation laws. We have thoroughly justified the detector’s design and
analyzed and demonstrated its performance on a number of benchmark problems.
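
The way an element-wise smoothness estimate can drive an artificial viscosity
is sketched below. The indicator shown is the simpler, well-known
highest-mode energy ratio in the style of Persson and Peraire, swapped in for
illustration; it is not the detector developed in this work. It assumes the
solution on each element is available as coefficients in an orthonormal modal
basis, and the threshold parameters s0, kappa and nu_max are hypothetical.

```python
import math


def smoothness_indicator(modal_coeffs):
    """log10 of the fraction of energy carried by the highest mode.

    Strongly negative values indicate a smooth, well-resolved solution.
    """
    total = sum(c * c for c in modal_coeffs)
    top = modal_coeffs[-1] ** 2
    if total == 0.0 or top == 0.0:
        return float("-inf")
    return math.log10(top / total)


def artificial_viscosity(s, s0, kappa, nu_max):
    """Smooth ramp from 0 to nu_max as the indicator s crosses s0 +/- kappa."""
    if s < s0 - kappa:
        return 0.0
    if s > s0 + kappa:
        return nu_max
    return 0.5 * nu_max * (1.0 + math.sin(math.pi * (s - s0) / (2.0 * kappa)))


if __name__ == "__main__":
    smooth = [1.0, 1e-2, 1e-4, 1e-6]    # rapidly decaying modes
    rough = [1.0, 0.5, 0.33, 0.25]      # slowly decaying modes
    for coeffs in (smooth, rough):
        s = smoothness_indicator(coeffs)
        print(s, artificial_viscosity(s, -4.0, 1.0, 0.1))
```

Elements with rapidly decaying modal coefficients receive no viscosity, while
elements with slowly decaying coefficients, as found near a shock, receive the
full amount. The smooth ramp in between avoids switching the viscosity on and
off abruptly.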